Require compound terms for typed literal objects by justinjoy · Pull Request #151 · semantic-reasoning/factlog

justinjoy · 2026-06-27T08:29:08Z

Two prompt-hardening changes to skills/factlog/references/text-to-fact.md
(the authoritative extraction criteria), plus a related stale-doc fix in the
factlog init template. All convert a soft "may" into a "must, when X", or
correct an out-of-date capability note.

1. Exhaustive extraction (completeness principle)

Dense tables — rosters, financial/registry status, budget line items,
schedules, career/patent records — are the highest-density fact source, yet the
prior criteria only said "record relation candidates." In practice the extractor
skimmed prose and dropped repeated table rows: a real proposal with ~400
extractable facts yielded ~90 (≈20–25% coverage).

- forbid sampling of repeated items ("just a few representative ones" → extract all N)
- table → triple mapping rule (row key → subject, header → relation, cell → object)
- judge coverage by section/table sweep, not by converted-file byte size
- pre-finish self-check, PII exclusions preserved
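
The table → triple rule above can be sketched in a few lines of Python. This is only an illustration, not the extractor's implementation: `table_to_triples`, its `key_column` parameter, and the sample budget table are hypothetical names and made-up data. The sketch emits one triple per filled cell and checks coverage by sweeping the table rather than by sampling rows.

```python
from typing import Iterable

Triple = tuple[str, str, str]


def table_to_triples(
    header: list[str], rows: Iterable[list[str]], key_column: int = 0
) -> list[Triple]:
    """Emit one (subject, relation, object) triple per non-empty, non-key cell."""
    triples: list[Triple] = []
    for row in rows:
        subject = row[key_column]  # row key -> subject
        for col, cell in enumerate(row):
            if col == key_column or not cell.strip():
                continue  # skip the key itself and empty cells
            # header -> relation, cell -> object
            triples.append((subject, header[col], cell))
    return triples


# Made-up budget table, for illustration only.
header = ["item", "category", "year"]
rows = [
    ["equipment", "capital", "2017"],
    ["travel", "operating", "2017"],
    ["licenses", "operating", ""],
]

triples = table_to_triples(header, rows)

# Coverage sweep: every filled, non-key cell must have produced a triple.
expected = sum(
    1 for row in rows for col, cell in enumerate(row) if col != 0 and cell.strip()
)
assert len(triples) == expected
```

A count mismatch in the final assertion is the kind of signal a pre-finish self-check could look for: it means rows were dropped, regardless of how large the converted file is.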

2. Typed-literal compound terms (not discretionary)

Date/amount/ordinal/number objects left as prose strings ("2017.03.08",
"126백만원", i.e. 126 million won) can't be sorted or thresholded by the engine.
Left to discretion, the extractor never emits compound terms (observed: 0 across
a full sync).

- require date()/ordinal()/amount()/number() for typed literals, with a
  prose→term mapping table
- engine-support note (corrected): date/ordinal/number/amount all
  project to comparable int64. number is fixed-point scaled ×1000 (3
  decimals), so a hand-authored threshold uses scaled integers (V >= 2000,
  not 2.0; an unscaled float fails loud). number AND amount are
  positive-only, so negative-capable values (e.g. an operating loss) cannot
  be made comparable and stay plain strings. amount also needs a unit table.
- cross-references attribute-relations.md / typed-relations.md
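
A rough sketch of the prose → term decision described above, under stated assumptions. The helpers `scale_number` and `to_object` are hypothetical and are not factlog's `literal_types.parse_number_scaled`. The argument syntax shown for `date(...)` and `number(...)` is a guess for illustration; the authoritative mapping table lives in text-to-fact.md. Rejecting values with more than 3 decimals is also an assumption. `amount()` is left out because it depends on a unit table.

```python
import re
from decimal import Decimal, InvalidOperation
from typing import Optional

SCALE = 1000  # fixed-point scale from the PR text: 3 decimals

DATE_RE = re.compile(r"(\d{4})\.(\d{2})\.(\d{2})")


def scale_number(text: str) -> Optional[int]:
    """Return text as a x1000 integer, or None when it cannot be made comparable."""
    try:
        value = Decimal(text)
    except InvalidOperation:
        return None
    if not value.is_finite() or value < 0:
        return None  # negatives are rejected, per the PR text
    scaled = value * SCALE
    if scaled != scaled.to_integral_value():
        return None  # assumption: more than 3 decimals is not representable
    return int(scaled)


def to_object(text: str) -> str:
    """Pick a compound term for a typed literal, else keep a plain string."""
    match = DATE_RE.fullmatch(text)
    if match:
        year, month, day = (int(part) for part in match.groups())
        return f"date({year}, {month}, {day})"  # hypothetical term syntax
    if scale_number(text) is not None:
        return f"number({text})"  # hypothetical term syntax
    return f'"{text}"'  # not comparable: stays a plain string


print(to_object("2017.03.08"))
print(to_object("2.5"))
print(to_object("-672"))
```

In this sketch a negative value such as an operating loss falls through to the plain-string branch, which mirrors the positive-only constraint in the engine-support note.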

3. Stale #125 note in the init template

#125 (number-type comparison) is closed/implemented, but
factlog/cli.py's typed-relations.md template still said number was "not yet
engine-projectable", which seeded incorrect guidance into every new KB. Updated
to the fixed-point ×1000 reality. (An earlier revision of this PR's
text-to-fact wording repeated the same stale claim; also corrected here.)

Docs/criteria/template only — no engine code paths touched. The reference file
is read at extraction time, so changes are live without reinstall. Verified:
scaffolded typed-relations.md still parses to {} with no warning;
test_typed_literals.sh (9), test_vocab.sh (18) pass; a number KB answers a
version >= 2.0 comparison written in scaled form (V >= 2000) and rejects an
unscaled float threshold.
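
The scaled-threshold behaviour verified above can be mimicked with a small stand-in. This is a sketch under assumptions, not the engine's `_assert_no_unscaled_number_threshold`: `require_scaled_threshold`, `versions_at_least`, and the sample facts are hypothetical, and stored values are assumed to already be ×1000 integers.

```python
SCALE = 1000  # fixed-point scale from the PR text


def require_scaled_threshold(threshold: object) -> int:
    """Accept only integer thresholds; a float such as 2.0 is treated as unscaled."""
    if isinstance(threshold, bool) or not isinstance(threshold, int):
        raise ValueError(
            f"unscaled threshold {threshold!r}: multiply by {SCALE} and pass an int"
        )
    return threshold


def versions_at_least(facts: dict[str, int], threshold: object) -> list[str]:
    """Return the subjects whose scaled value meets the scaled threshold."""
    limit = require_scaled_threshold(threshold)
    return sorted(name for name, value in facts.items() if value >= limit)


# Made-up facts; values are stored already scaled (1.5 -> 1500, 2.0 -> 2000).
facts = {"alpha": 1500, "beta": 2000, "gamma": 3250}

print(versions_at_least(facts, 2000))  # scaled form of "version >= 2.0"

try:
    versions_at_least(facts, 2.0)  # unscaled float: fails loud
except ValueError as err:
    print(err)
```

Failing loudly on the float is the safer choice here: silently comparing 2.0 against scaled integers would match every stored value and return misleading answers.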

Promote the typed-literal guidance from discretionary ("may, if clearer") to directive: dates, amounts, ordinals, and plain numbers MUST be written as compact compound terms (date()/ordinal()/amount()/number()) instead of prose strings. Left to discretion the extractor never emits them, so the engine cannot sort/threshold/range over values that are really comparable.

- mapping table prose -> compound term per type
- honest engine-support note: date/ordinal fully project; amount needs a unit table and is positive-int only (use number() for negatives); number projection still pending (#125) but emit the term for structure
- cross-reference attribute-relations.md / typed-relations.md so declared relations actually project and compare

…ble)

#125 (number-type comparison) is CLOSED: `number` projects to a fixed-point int64 scaled ×1000 (literal_types.parse_number_scaled), so date/ordinal/number/amount all compare. The init template (factlog/cli.py) and this PR's earlier text-to-fact wording still said number was "not yet engine-projectable", which is wrong and was the source of incorrect guidance.

Fixes both:

- factlog/cli.py typed-relations.md template: number now documented as fixed-point ×1000 int64, positive-only, thresholds in scaled units.
- text-to-fact.md: number is projectable; thresholds use scaled integers (`V >= 2000`, not `2.0`); number AND amount reject negatives (verified: parse_number_scaled('-672') -> None), so negative-capable values (e.g. an operating loss) cannot be made comparable and stay plain strings.

Verified: a number KB answers `version >= 2.0` (scaled `V >= 2000`) and an unscaled float threshold fails loud via _assert_no_unscaled_number_threshold. Scaffolded typed-relations.md still parses to {} with no warning; typed_literals (9) and vocab (18) tests pass.

justinjoy added 2 commits June 27, 2026 17:28

justinjoy merged commit ab539bf into main Jun 27, 2026
3 checks passed

justinjoy deleted the typed-literal-compound-terms branch June 27, 2026 10:38

justinjoy merged 2 commits into main from typed-literal-compound-terms
